Aggregating GlusterFS Logs with fluentd
Introduction
In a recent post, "Building a Highly Available Mail Server with GlusterFS | Developers.IO", I set up a configuration on Amazon Linux that distributes files with GlusterFS.
Since GlusterFS is a distributed file system, its nodes are tightly interrelated. But with each node accumulating its own logs separately, it becomes hard to trace cause and effect across nodes from the logs (for example, what state the other nodes were in when one node detected an anomaly).
What do you do in a situation like this? fluentd, of course.
In this post, I'll aggregate GlusterFS logs using a plugin called fluent-plugin-glusterfs.
Architecture
There are two GlusterFS cluster nodes, each placed in a different AZ. We set up a log server to aggregate their logs: each node forwards its logs with fluentd, and the log server saves them to a single file.
Configuring the GlusterFS servers
For the GlusterFS setup itself, please refer to the previous article.
First, install fluentd (td-agent).
$ curl -L http://toolbelt.treasuredata.com/sh/install-redhat.sh | sh
Next, install fluent-plugin-glusterfs.
$ sudo /usr/lib64/fluent/ruby/bin/fluent-gem install fluent-plugin-glusterfs
Fetching: fluent-plugin-glusterfs-1.0.0.gem (100%)
Successfully installed fluent-plugin-glusterfs-1.0.0
1 gem installed
Installing ri documentation for fluent-plugin-glusterfs-1.0.0...
Installing RDoc documentation for fluent-plugin-glusterfs-1.0.0...
By default, the GlusterFS logs are readable and writable only by root, so fluentd cannot read them. Change the permissions so that other users can read them as well.
$ ls -alF /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
-rw------- 1 root root 1385 Feb  3 07:21 2014 /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
$ sudo chmod +r /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
$ ls -alF /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
-rw-r--r-- 1 root root 1385 Feb  3 07:21 2014 /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
Now for the fluentd configuration. It tails the GlusterFS log file and forwards entries to the log server.
$ sudo vi /etc/td-agent/td-agent.conf

<source>
  type glusterfs_log
  path /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
  pos_file /var/log/td-agent/etc-glusterfs-glusterd.vol.log.pos
  tag glusterfs_log.glusterd
  format /^(?<message>.*)$/
</source>

<match glusterfs_log.**>
  type forward
  send_timeout 60s
  recover_wait 10s
  heartbeat_interval 1s
  phi_threshold 8
  hard_timeout 60s
  <server>
    name logserver
    host 172.31.10.100
    port 24224
    weight 60
  </server>
  <secondary>
    type file
    path /var/log/td-agent/forward-failed
  </secondary>
</match>
Finally, start the fluentd service. Enable it at boot, too.
$ sudo service td-agent start
Starting td-agent:                                         [  OK  ]
$ sudo chkconfig td-agent on
Configuring the log aggregation server
Install fluentd the same way as on the GlusterFS servers.
$ curl -L http://toolbelt.treasuredata.com/sh/install-redhat.sh | sh
Edit the fluentd configuration file so that received GlusterFS logs are saved under /var/log/td-agent/glusterd.
<source>
  type forward
  port 24224
  bind 0.0.0.0
</source>

<match glusterfs_log.glusterd>
  type file
  path /var/log/td-agent/glusterd
</match>
Start the fluentd service.
$ sudo service td-agent start
Starting td-agent:                                         [  OK  ]
$ sudo chkconfig td-agent on
As for the Security Group, make sure 24224/tcp and 24224/udp are reachable from the GlusterFS servers to the log server (fluentd's forward output sends log data over TCP and, by default, heartbeats over UDP).
Verification
Run volume stop VOL or the like, and... entries like the following are saved to /var/log/td-agent/glusterd! Notice that the hostname field contains more than one host name: logs from both nodes have landed in one place.
2014-02-03T11:28:02+00:00 glusterfs_log.glusterd {"date":"2014-02-03","time":"11:28:02","time_usec":"974779","log_level":"I","source_file_name":"glusterd-handler.c","source_line":"866","function_name":"glusterd_handle_cli_get_volume","component_name":"0-glusterd","message":"Received get vol req","hostname":"ip-172-31-27-73"}
2014-02-03T11:28:21+00:00 glusterfs_log.glusterd {"date":"2014-02-03","time":"11:28:21","time_usec":"823180","log_level":"I","source_file_name":"glusterd-handler.c","source_line":"502","function_name":"glusterd_handle_cluster_lock","component_name":"0-glusterd","message":"Received LOCK from uuid: b5fca2b0-d656-4149-8e9e-f29feacefd54","hostname":"ip-172-31-27-73"}
2014-02-03T11:28:21+00:00 glusterfs_log.glusterd {"date":"2014-02-03","time":"11:28:21","time_usec":"823265","log_level":"I","source_file_name":"glusterd-utils.c","source_line":"285","function_name":"glusterd_lock","component_name":"0-glusterd","message":"Cluster lock held by b5fca2b0-d656-4149-8e9e-f29feacefd54","hostname":"ip-172-31-27-73"}
2014-02-03T11:28:21+00:00 glusterfs_log.glusterd {"date":"2014-02-03","time":"11:28:21","time_usec":"823329","log_level":"I","source_file_name":"glusterd-handler.c","source_line":"1322","function_name":"glusterd_op_lock_send_resp","component_name":"0-glusterd","message":"Responded, ret: 0","hostname":"ip-172-31-27-73"}
2014-02-03T11:28:21+00:00 glusterfs_log.glusterd {"date":"2014-02-03","time":"11:28:21","time_usec":"827426","log_level":"I","source_file_name":"glusterd-handler.c","source_line":"1366","function_name":"glusterd_handle_cluster_unlock","component_name":"0-glusterd","message":"Received UNLOCK from uuid: b5fca2b0-d656-4149-8e9e-f29feacefd54","hostname":"ip-172-31-27-73"}
2014-02-03T11:28:21+00:00 glusterfs_log.glusterd {"date":"2014-02-03","time":"11:28:21","time_usec":"827486","log_level":"I","source_file_name":"glusterd-handler.c","source_line":"1342","function_name":"glusterd_op_unlock_send_resp","component_name":"0-glusterd","message":"Responded to unlock, ret: 0","hostname":"ip-172-31-27-73"}
2014-02-03T11:28:16+00:00 glusterfs_log.glusterd {"date":"2014-02-03","time":"11:28:16","time_usec":"155641","log_level":"I","source_file_name":"glusterd-handler.c","source_line":"866","function_name":"glusterd_handle_cli_get_volume","component_name":"0-glusterd","message":"Received get vol req","hostname":"ip-172-31-12-220"}
2014-02-03T11:28:21+00:00 glusterfs_log.glusterd {"date":"2014-02-03","time":"11:28:21","time_usec":"820585","log_level":"I","source_file_name":"glusterd-volume-ops.c","source_line":"354","function_name":"glusterd_handle_cli_stop_volume","component_name":"0-glusterd","message":"Received stop vol reqfor volume gVol0","hostname":"ip-172-31-12-220"}
2014-02-03T11:28:21+00:00 glusterfs_log.glusterd {"date":"2014-02-03","time":"11:28:21","time_usec":"820676","log_level":"I","source_file_name":"glusterd-utils.c","source_line":"285","function_name":"glusterd_lock","component_name":"0-glusterd","message":"Cluster lock held by b5fca2b0-d656-4149-8e9e-f29feacefd54","hostname":"ip-172-31-12-220"}
2014-02-03T11:28:21+00:00 glusterfs_log.glusterd {"date":"2014-02-03","time":"11:28:21","time_usec":"820698","log_level":"I","source_file_name":"glusterd-handler.c","source_line":"463","function_name":"glusterd_op_txn_begin","component_name":"0-management","message":"Acquired local lock","hostname":"ip-172-31-12-220"}
2014-02-03T11:28:21+00:00 glusterfs_log.glusterd {"date":"2014-02-03","time":"11:28:21","time_usec":"823246","log_level":"I","source_file_name":"glusterd-rpc-ops.c","source_line":"548","function_name":"glusterd3_1_cluster_lock_cbk","component_name":"0-glusterd","message":"Received ACC from uuid: 83c8d48a-071e-4934-920f-b0fb8c0acdf4","hostname":"ip-172-31-12-220"}
2014-02-03T11:28:21+00:00 glusterfs_log.glusterd {"date":"2014-02-03","time":"11:28:21","time_usec":"825009","log_level":"E","source_file_name":"glusterd-volume-ops.c","source_line":"909","function_name":"glusterd_op_stage_stop_volume","component_name":"0-","message":"Volume gVol0 has not been started","hostname":"ip-172-31-12-220"}
2014-02-03T11:28:21+00:00 glusterfs_log.glusterd {"date":"2014-02-03","time":"11:28:21","time_usec":"825047","log_level":"E","source_file_name":"glusterd-op-sm.c","source_line":"1999","function_name":"glusterd_op_ac_send_stage_op","component_name":"0-","message":"Staging failed","hostname":"ip-172-31-12-220"}
2014-02-03T11:28:21+00:00 glusterfs_log.glusterd {"date":"2014-02-03","time":"11:28:21","time_usec":"825080","log_level":"I","source_file_name":"glusterd-op-sm.c","source_line":"2039","function_name":"glusterd_op_ac_send_stage_op","component_name":"0-glusterd","message":"Sent op req to 0 peers","hostname":"ip-172-31-12-220"}
2014-02-03T11:28:21+00:00 glusterfs_log.glusterd {"date":"2014-02-03","time":"11:28:21","time_usec":"827376","log_level":"I","source_file_name":"glusterd-rpc-ops.c","source_line":"607","function_name":"glusterd3_1_cluster_unlock_cbk","component_name":"0-glusterd","message":"Received ACC from uuid: 83c8d48a-071e-4934-920f-b0fb8c0acdf4","hostname":"ip-172-31-12-220"}
2014-02-03T11:28:21+00:00 glusterfs_log.glusterd {"date":"2014-02-03","time":"11:28:21","time_usec":"827416","log_level":"I","source_file_name":"glusterd-op-sm.c","source_line":"2653","function_name":"glusterd_op_txn_complete","component_name":"0-glusterd","message":"Cleared local lock","hostname":"ip-172-31-12-220"}
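Because every aggregated record carries a hostname field, ordinary shell tools can already answer simple cross-node questions. As a minimal sketch, the pipeline below counts error-level (log_level "E") records per originating host; the file glob is an assumption based on out_file's date-suffixed output files under the path configured above.

```shell
# Count error-level records per host in the aggregated GlusterFS log.
# /var/log/td-agent/glusterd.*.log is assumed from this article's setup.
grep -h '"log_level":"E"' /var/log/td-agent/glusterd.*.log \
  | sed 's/.*"hostname":"\([^"]*\)".*/\1/' \
  | sort | uniq -c
```

For anything beyond quick checks like this, a proper JSON-aware tool (jq, or a search backend) would be more robust than regex extraction.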
Summary
From an operations standpoint, centralized log management and backup are very important, and I think fluentd is the best software for achieving them. This time I simply saved the logs to a file, but you could just as easily push them into something like Elasticsearch and make them searchable. The possibilities are exciting.
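As a rough sketch of that idea, the file output on the log server could be swapped for an Elasticsearch output. This assumes the fluent-plugin-elasticsearch gem is installed on the log server; the host name is a placeholder.

```
<match glusterfs_log.glusterd>
  type elasticsearch
  host elasticsearch.example.com
  port 9200
  logstash_format true
</match>
```

With logstash_format enabled, records land in daily logstash-YYYY.MM.DD indices, which makes them easy to explore with Kibana.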